This is the analysis by voivodeship. The main notebook of the whole analysis is located at Analysis.ipynb.
We will make use of the following libraries in our analysis.
import pandas as pd
import json
from IPython.display import display, Markdown
import plotly.express as px
We also import our own constants and functions.
from own_data import (
    candidates, candidates_colors, poland_center, poland_zoom, map_margin,
    parties_2019, parties_2019_colors, opacity, parties_to_candidates,
)
from utils import comma_to_dot, simplify_party
We read the CSV file with the results by voivodeship as percentages. The data comes from the website of the National Electoral Commission. Poland uses a comma as the decimal separator, so we convert the values to dot-separated numbers.
results_voivodeships_percent_df = pd.read_csv('data/results/results_by_voivodeship_percent.csv', sep=';')
results_voivodeships_percent_df = results_voivodeships_percent_df[['Kod TERYT', 'Województwo'] + candidates]
for candidate in candidates:
    results_voivodeships_percent_df[candidate] = results_voivodeships_percent_df[candidate].map(comma_to_dot)
results_voivodeships_percent_df.head()
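The helper comma_to_dot lives in utils and is not shown in this notebook. As a rough illustration of the conversion it performs, a minimal sketch could look as follows. The name comma_to_dot_sketch and its exact behaviour are assumptions, not the project's actual implementation.

```python
def comma_to_dot_sketch(value):
    """Hypothetical stand-in for utils.comma_to_dot: '12,34' -> 12.34."""
    # Cast to str first so values that were already parsed as numbers also work.
    return float(str(value).replace(',', '.'))


# Illustrative values only, not real election results.
converted = [comma_to_dot_sketch(v) for v in ['12,34', '7,5', 40]]
```

As an alternative, pd.read_csv accepts a decimal=',' argument, which parses comma-separated decimals while the file is being loaded.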
At the same time, we import the geographical data on the borders of each voivodeship, which comes from the official data of the Head Office of Geodesy and Cartography. The GIS Support PL website lets us download the voivodeship package on its own. To create the maps we use the GeoJSON format. The downloaded data has the .shp extension, so we converted it to GeoJSON using MapShaper.
with open('data/geojson/voivodeships.json', encoding='utf-8') as response:
    voivodeships = json.load(response)
voivodeships['features'][0]['properties']
The TERYT code uniquely identifies each administrative unit. In the election results the code has four extra trailing zeros. Additionally, it lacks the leading zero when the voivodeship number has only one digit. We fix these issues so that the two data sets can be joined.
def fix_teryt_voivodeship(teryt):
    """Fix TERYT code to integrate the two datasets for voivodeships."""
    teryt = str(teryt)
    if len(teryt) == 5:
        teryt = '0' + teryt
    return teryt[:-4]
results_voivodeships_percent_df['Kod TERYT'] = \
results_voivodeships_percent_df['Kod TERYT'].astype(str).map(fix_teryt_voivodeship)
results_voivodeships_percent_df.head()
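To make the transformation concrete: following the description above, a single-digit voivodeship such as 02 arrives as 20000, while a two-digit one such as 14 arrives as 140000. The sketch below is an equivalent formulation using str.zfill. The name fix_teryt_zfill is hypothetical and used only for this illustration, and the sketch assumes every input has five or six digits.

```python
def fix_teryt_zfill(teryt):
    """Pad to six digits, then keep the two-digit voivodeship prefix."""
    return str(teryt).zfill(6)[:2]


# Both an integer and a string input are handled the same way.
examples = {code: fix_teryt_zfill(code) for code in [20000, '140000']}
```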
This is where the key that joins our two data sets is located in the voivodeships GeoJSON:
voivodeships['features'][0]['properties']['JPT_KOD_JE']
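Before plotting, it can be worth checking that every fixed TERYT code has a matching feature, since a mismatch would leave that voivodeship without data on the map. The sketch below shows one way to do this on a toy structure. The function name unmatched_codes and the toy values are assumptions made for illustration.

```python
def unmatched_codes(geojson, codes, key='JPT_KOD_JE'):
    """Return the codes with no feature whose properties[key] equals them."""
    known = {feature['properties'][key] for feature in geojson['features']}
    return sorted(set(codes) - known)


# Toy GeoJSON-like structure with the same layout as the voivodeships file.
toy_geojson = {'features': [
    {'properties': {'JPT_KOD_JE': '02'}},
    {'properties': {'JPT_KOD_JE': '14'}},
]}

# '2' lacks the leading zero, so it has no matching feature.
missing = unmatched_codes(toy_geojson, ['02', '14', '2'])
```

With the notebook's data the call would be unmatched_codes(voivodeships, results_voivodeships_percent_df['Kod TERYT']), and an empty list would mean that all codes line up.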
We plot the data on maps.
def get_figure_results_by_voivodeship(candidate):
    """Get figure showing a map of results of the given candidate by voivodeship."""
    candidate_df = results_voivodeships_percent_df[['Kod TERYT', 'Województwo', candidate]]
    fig = px.choropleth_mapbox(
        candidate_df, geojson=voivodeships, color=candidate,
        locations='Kod TERYT', featureidkey="properties.JPT_KOD_JE",
        center=poland_center,
        opacity=opacity, color_continuous_scale=candidates_colors[candidate],
        hover_data={'Województwo': True, 'Kod TERYT': False},
        mapbox_style="carto-positron", zoom=poland_zoom
    )
    fig.update_layout(margin=map_margin)
    return fig
for candidate in candidates:
    display(Markdown(f'### Results of {candidate} by voivodeship'))
    get_figure_results_by_voivodeship(candidate).show()
winners_voivodeships_df = pd.concat([
    results_voivodeships_percent_df[candidates].idxmax(axis=1).rename('Winner').to_frame(),
    results_voivodeships_percent_df[candidates].max(axis=1).rename('Result').to_frame(),
    results_voivodeships_percent_df[['Województwo', 'Kod TERYT']]
], axis=1)
winners_voivodeships_df.head(1)
winners_voivodeships_fig = px.choropleth_mapbox(
    winners_voivodeships_df, geojson=voivodeships, color='Winner',
    locations='Kod TERYT', featureidkey="properties.JPT_KOD_JE",
    center=poland_center,
    opacity=opacity, color_discrete_sequence=px.colors.qualitative.D3,
    hover_data={'Województwo': True, 'Kod TERYT': False, 'Result': True},
    mapbox_style="carto-positron", zoom=poland_zoom
)
winners_voivodeships_fig.update_layout(margin=map_margin)
winners_voivodeships_fig.show()
In the 2020 presidential election the results are:
When we compare these results with the ones from the parliamentary elections in 2019, we can see that they are quite similar. In 2019, the parties of these candidates got, respectively:
We compare the results of these two elections and check whether the voters' preferences in the voivodeships have changed.
We begin by parsing the National Electoral Commission's data on the 2019 vote for the Sejm lists.
results_voivodeships_percent_2019_df = pd.read_csv('data/results/results_by_voivodeship_percent_2019.csv', sep=';')
results_voivodeships_percent_2019_df = results_voivodeships_percent_2019_df[['Kod TERYT', 'Województwo'] + parties_2019]
for party in parties_2019:
    results_voivodeships_percent_2019_df[party] = results_voivodeships_percent_2019_df[party].map(comma_to_dot)
results_voivodeships_percent_2019_df.head()
The geographical data is already parsed.
We fix the TERYT code in the same way as before.
results_voivodeships_percent_2019_df['Kod TERYT'] = \
results_voivodeships_percent_2019_df['Kod TERYT'].astype(str).map(fix_teryt_voivodeship)
results_voivodeships_percent_2019_df.head()
def get_figure_results_by_voivodeship_2019(party):
    """Get figure showing a map of results of the given party by voivodeship."""
    party_df = results_voivodeships_percent_2019_df[['Kod TERYT', 'Województwo', party]]
    party_df.columns = party_df.columns.to_series().apply(simplify_party)
    simplified_party = simplify_party(party)
    fig = px.choropleth_mapbox(
        party_df, geojson=voivodeships, color=simplified_party,
        locations='Kod TERYT', featureidkey="properties.JPT_KOD_JE",
        center=poland_center,
        opacity=opacity, color_continuous_scale=parties_2019_colors[party],
        hover_data={'Województwo': True, 'Kod TERYT': False},
        mapbox_style="carto-positron", zoom=poland_zoom
    )
    fig.update_layout(margin=map_margin)
    return fig
for party in parties_2019:
    simplified_party = simplify_party(party)
    display(Markdown(f'### Results of {simplified_party} in 2019 by voivodeship'))
    get_figure_results_by_voivodeship_2019(party).show()
    candidate = parties_to_candidates[simplified_party]
    display(Markdown(f"### Results of the {simplified_party}'s candidate in 2020 by voivodeship"))
    get_figure_results_by_voivodeship(candidate).show()
We can observe that the support for Andrzej Duda and for the Prawo i Sprawiedliwość party is almost the same. In the case of Rafał Trzaskowski and Platforma Obywatelska, there are slight differences in the western part of Poland.
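The visual impression could also be backed by numbers. The sketch below computes, for each voivodeship, the difference in percentage points between a candidate's 2020 result and the party's 2019 result. The function name support_change and the sample values are made up for illustration. In the notebook the inputs would come from results_voivodeships_percent_df and results_voivodeships_percent_2019_df, keyed by the fixed TERYT code.

```python
def support_change(results_2020, results_2019):
    """Map each shared TERYT code to the 2020 minus 2019 result, in percentage points."""
    # Only codes present in both elections can be compared.
    shared = results_2020.keys() & results_2019.keys()
    return {code: round(results_2020[code] - results_2019[code], 2)
            for code in sorted(shared)}


# Invented numbers, not real election results.
change = support_change({'02': 50.5, '14': 30.25, '32': 41.0},
                        {'02': 48.0, '14': 31.75})
```

Values close to zero across all voivodeships would support the observation that the support barely changed, while larger values in particular regions would point to a shift there.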